Aim: check quality of chosen variables for panel analysis (time series: TS).

The notebook takes some time to run because of the facet_wrap by country, so if you don’t want to re-run the code, just read the HTML output.


library(tidyverse)
library(naniar)


# variables of interest:
vars <- c('vdem_edcomp_thick', 'vdem_partip', 'vdem_corr', 'vdem_egal', 'fh_cl', 'fh_pr')

# data set:
ts <- readRDS("../../../Data/Data for Modelling/LEGTER_ts.rds")

# select only columns of interest:
ts <- ts %>%
  select(year, 
         consolidated_country,
         all_of(vars)  # one_of() is superseded; all_of() errors if a listed column is absent
         )

# glimpse(ts)
ts

A quick look at the data suggests that the Varieties of Democracy (V-Dem) variables are more granular and change from year to year, whereas the Freedom House (FH) variables vary less over the years.
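The impression above can be checked numerically. The following is a minimal sketch, not part of the original notebook: it runs on a small invented data frame (`toy`, a hypothetical stand-in for `ts` with made-up values) and computes two rough indicators, the number of distinct values per variable and the share of country-years in which a value differs from the previous year.

```r
library(dplyr)

# Hypothetical stand-in for `ts`; all values are invented for illustration
toy <- tibble(
  consolidated_country = rep(c("A", "B"), each = 4),
  year = rep(2000:2003, times = 2),
  vdem_corr = c(0.21, 0.24, 0.22, 0.27, 0.80, 0.78, 0.83, 0.81),
  fh_cl = c(3, 3, 3, 4, 6, 6, 6, 6)
)

# Granularity: number of distinct non-missing values per variable
granularity <- toy %>%
  summarise(across(c(vdem_corr, fh_cl), ~ n_distinct(.x, na.rm = TRUE)))

# Variation over time: share of country-years whose value differs
# from the same country's value in the previous row (first year is NA)
change_share <- toy %>%
  arrange(consolidated_country, year) %>%
  group_by(consolidated_country) %>%
  mutate(across(c(vdem_corr, fh_cl), ~ .x != lag(.x))) %>%
  ungroup() %>%
  summarise(across(c(vdem_corr, fh_cl), ~ mean(.x, na.rm = TRUE)))

granularity
change_share
```

Applied to the real `ts`, the same two summaries would cover all six variables in `vars`; whether they confirm the visual impression depends on the actual data.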

Check NAs

Let’s see if we have many NAs, and how they are distributed.

Year and country are complete, but the V-Dem and FH variables are missing for a few country-year combinations.


# tmp <- na.omit(ts)

vis_miss(ts)
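The plot gives the overall picture; a numeric count per variable can complement it. This is a minimal base R sketch on an invented data frame (`toy` is hypothetical and only mimics the column names of `ts`), reporting the number and percentage of missing values per column.

```r
# Hypothetical stand-in for `ts`; all values are invented for illustration
toy <- data.frame(
  year = c(2000, 2001, 2000, 2001),
  consolidated_country = c("A", "A", "B", "B"),
  vdem_corr = c(0.2, NA, 0.8, 0.7),
  fh_cl = c(3, 3, NA, NA)
)

# Count and percentage of missing values in each column
n_missing <- colSums(is.na(toy))
pct_missing <- round(100 * colMeans(is.na(toy)), 1)

missing_summary <- data.frame(
  variable = names(n_missing),
  n_missing = n_missing,
  pct_missing = pct_missing,
  row.names = NULL
)

# Most affected variables first
missing_summary[order(-missing_summary$n_missing), ]
```

Replacing `toy` with `ts` should give the same kind of table for the real data.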

Which countries are missing data? First we can explore all observations with missing values:


ts[!complete.cases(ts), ]

The countries with missing values are listed below. They include a few large countries (France, Pakistan, Malaysia, …).

==> TO DO: check whether it is correct that there is no V-Dem and FH data for France and Pakistan


ts[!complete.cases(ts), ]$consolidated_country %>% unique()
 [1] Bahamas                  Bahrain                  Belize                  
 [4] Congo Republic           Cyprus                   Ethiopia                
 [7] France                   Hong Kong                Malaysia                
[10] North Macedonia          Pakistan                 Slovak Republic         
[13] St. Lucia                Swaziland                West Bank and Gaza Strip
[16] Western Sahara           Yugoslavia              
241 Levels: Afghanistan Albania Algeria American Samoa Andorra Angola ... Zimbabwe
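To follow up on the TO DO, it may help to see, for each affected country, how many rows are incomplete, which years they span, and which variables are missing. The sketch below uses an invented data frame (`toy` is hypothetical, with made-up values and only two of the six variables) rather than the real `ts`.

```r
library(dplyr)

# Hypothetical stand-in for `ts`; all values are invented for illustration
toy <- tibble(
  consolidated_country = c("France", "France", "France",
                           "Pakistan", "Pakistan", "Chile"),
  year = c(1990, 1991, 1992, 1990, 1991, 1990),
  vdem_corr = c(NA, 0.1, 0.1, 0.7, NA, 0.2),
  fh_cl = c(NA, NA, 2, 5, 5, 2)
)

# Keep incomplete rows, then summarise them per country:
# number of incomplete rows, the years they span, and NA counts per variable
missing_by_country <- toy %>%
  filter(!complete.cases(.)) %>%
  group_by(consolidated_country) %>%
  summarise(
    n_incomplete_rows = n(),
    first_year = min(year),
    last_year = max(year),
    across(c(vdem_corr, fh_cl), ~ sum(is.na(.x)), .names = "na_{.col}")
  )

missing_by_country
```

A country whose incomplete rows cover only a few years, or only one data source, points to a partial gap rather than a complete absence of data.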

Check Distributions

Looking at the distribution of the variables of interest:

ts %>%
  select(fh_cl, fh_pr) %>%
  gather() %>% 
  ggplot(aes(value)) +
  facet_wrap(~ key, scales = "free") +
  geom_bar() +  # FH ratings are discrete, so count each value; avoids the ignored-parameter warning from geom_histogram(stat = "count")
  theme_minimal() + 
  ggtitle("Distributions for Freedom House index variables")

ts %>%
  select(vdem_edcomp_thick, vdem_partip, vdem_corr, vdem_egal) %>%
  gather() %>% 
  ggplot(aes(value)) +
  facet_wrap(~ key, scales = "free") +
  geom_histogram(bins = 20) +
  theme_minimal() + 
  ggtitle("Distributions for V-Dem's variables, 20 bins")

ts %>%
  select(vdem_edcomp_thick, vdem_partip, vdem_corr, vdem_egal) %>%
  gather() %>% 
  ggplot(aes(value)) +
  facet_wrap(~ key, scales = "free") +
  geom_density() +
  theme_minimal() + 
  ggtitle("Distributions for V-Dem's variables, density")
